Reasoning  about  Human  Participation  in 
Self-Adaptive  Systems 


Javier  Camara 
Institute  for  Software  Research 
Carnegie  Mellon  University 
Pittsburgh,  PA  15213,  USA 
Email:  jcmoreno@cs.cmu.edu 


Gabriel  A.  Moreno 
Software  Engineering  Institute 
Carnegie  Mellon  University 
Pittsburgh,  PA  15213,  USA 
Email:  gmoreno@sei.cmu.edu 


David Garlan
Institute for Software Research
Carnegie  Mellon  University 
Pittsburgh,  PA  15213,  USA 
Email:  garlan@cs.cmu.edu 


Abstract: Self-adaptive systems overcome many of the limitations
of human supervision in complex software-intensive
systems  by  endowing  them  with  the  ability  to  automatically 
adapt  their  structure  and  behavior  in  the  presence  of  runtime 
changes. However, adaptation in some classes of systems (e.g.,
safety-critical) can benefit substantially from receiving information
from humans (e.g., acting as sophisticated sensors or decision-makers),
or from involving them as system-level effectors to execute
adaptations (e.g., when automation is not possible, or as a fallback
mechanism). Human participants, in turn, are influenced by
factors external to the system (e.g., training level, fatigue) that
affect the likelihood of success when they perform a task, its
duration, or even whether they are willing to perform it in the first
place. Without careful consideration of these factors, how can the
system decide when to involve humans in adaptation, and in which
way? In this paper, we investigate the problem of how the explicit
modeling  of  human  participants  can  provide  a  better  insight  into 
the  trade-offs  of  involving  humans  in  adaptation.  We  contribute 
a formal framework to reason about human involvement in
self-adaptation, focusing on the role of human participants as actors
(i.e.,  effectors)  during  the  execution  stage  of  adaptation.  The 
approach  consists  of:  (i)  a  language  able  to  express  adaptation 
models  that  capture  factors  affecting  human  behavior  and  its 
interactions  with  the  system,  and  (ii)  a  formalization  of  these 
adaptation  models  as  stochastic  multiplayer  games  (SMGs)  that 
can  be  used  to  analyze  human-system-environment  interactions. 
We  illustrate  our  approach  in  an  adaptive  industrial  middleware 
used  to  monitor  and  manage  sensor  networks  in  renewable  energy 
production  plants. 

I.  Introduction 

Software-intensive  systems  are  progressively  more  relied 
upon  to  support  a  wide  variety  of  tasks  in  modern  society. 
Still,  the  ability  of  these  systems  to  provide  service  that  can 
justifiably  be  trusted  in  the  presence  of  changes,  either  in 
their  environment  (e.g.,  resource  availability)  or  the  system 
itself  (e.g.,  faults)  is  increasingly  affected  by  their  growing 
complexity,  as  well  as  by  the  dynamic  and  unpredictable  nature 
of  the  environments  in  which  they  typically  have  to  operate. 

Early attempts to address this situation involved human
supervision, which is expensive and unreliable, owing to the
fallible nature of humans and the difficulty they have reacting in a
timely manner.

Over the last decade, self-adaptive systems [11], [21] have
emerged as an alternative that overcomes many of the limitations
of human supervision by endowing systems with sophisticated
mechanisms to automatically adapt their structure
and behavior at runtime. Self-adaptation approaches usually
rely  exclusively  on  closed-loop  controllers  (e.g.,  implementing 
the  MAPE-K  model  [21])  that  eliminate  human  intervention 
from  adaptation.  Although  these  fully  automated  approaches 
have  proven  successful  in  different  application  domains,  there 
are  situations  in  which  they  may  be  suboptimal,  or  simply  do 
not  suffice  to  guarantee  adequate  service  provision  (e.g.,  when 
adaptations  involve  physical  changes  to  the  system  that  cannot 
be  automated). 

In particular, the different activities of the MAPE-K loop in
some classes of systems (e.g., safety-critical) and application
domains can benefit remarkably from human involvement by:
(i) receiving information that is difficult to monitor or analyze
automatically from humans acting as sophisticated sensors (e.g.,
indicating whether there is an ongoing anomalous situation),
(ii) incorporating human input into the decision-making process
to gain better insight into the best way of adapting the
system, or (iii) employing humans as system-level effectors to
execute adaptations (e.g., in cases in which full automation is
not possible, or as a fallback mechanism).

However, the behavior of human participants is typically
influenced by factors external to the system itself (e.g., training
level, stress, fatigue) that determine their likelihood of
succeeding  at  carrying  out  a  particular  task,  how  long  it  will 
take,  or  even  if  they  are  willing  to  perform  it  in  the  first  place. 
Without  careful  consideration  of  all  these  factors,  how  can  the 
system  decide  when  to  involve  humans  in  adaptation,  and  in 
which  way? 

Answering these questions demands new approaches to
systematically reason about how the behavior of human participants,
incorporated as integral system elements, affects the
outcome of adaptation.

In  this  paper,  we  investigate  the  problem  of  how  the  explicit 
modeling  of  human  factors  can  provide  a  better  insight  into 
the  trade-offs  of  involving  humans  in  adaptation.  In  particular, 
we  contribute  a  formal  framework  to  reason  about  human 
involvement  in  self-adaptation,  focusing  on  the  role  of  human 
participants  as  actors  (i.e.,  effectors)  during  the  execution 
stage  of  adaptation.  The  backbone  of  the  approach  consists 
of  two  parts:  (i)  an  extended  version  of  a  language  to  express 
adaptation  models  with  elements  that  capture  some  of  the  main 
factors that affect human behavior and its interactions with
the system, and (ii) a formalization of the extended adaptation
models  as  stochastic  multiplayer  games  (SMGs)  that  can  be 
used  to  analyze  human-system-environment  interactions. 

To explore these issues, we propose extending the
Stitch language [12], employed in the Rainbow framework for
self-adaptation [19], with elements inspired by the
opportunity-willingness-capability models employed in cyber-human
systems (CHS) [17], which capture major factors that influence
human participant behavior. We illustrate our approach in
the  context  of  an  adaptive  industrial  middleware  that  is  used 
to  monitor  and  manage  highly  populated  networks  of  devices 
in  renewable  energy  production  plants. 

This  paper  is  organized  as  follows.  In  Section  II,  we 
present  DCAS,  the  system  that  we  employ  to  illustrate 
our  approach.  Section  III  introduces  related  work  in  the 
areas  of  self-adaptive  systems,  business  process  modeling, 
and  cyber-human  systems,  outlining  some  of  the  major 
requirements  for  the  approach.  Section  IV  details  the  formal 
basis  for  mixed-initiative  adaptation  in  the  context  of  Stitch. 
In  Section  V  we  show  how  the  formal  modeling  of  human- 
in-the-loop  adaptation  in  Rainbow  can  be  leveraged  through 
the  use  of  probabilistic  model  checking  of  SMGs  to  reason 
about  the  combined  behavior  of  human  participants  with  a 
self-adaptive  system.  Finally,  Section  VI  presents  conclusions 
and  directions  for  future  work. 

II.  Motivating  Scenario 

The  Data  Acquisition  and  Control  Service  (DCAS)  [5]  is 
a  middleware  from  Critical  Software  that  provides  a  reusable 
infrastructure  to  manage  the  monitoring  of  highly  populated 
networks  of  devices  equipped  with  sensors.  In  particular, 
the  middleware  is  designed  to  be  seamlessly  integrated  with 
Critical’s  Energy  Management  System  (csEMS)1,  which  is  a 
platform  that  provides  asset  management  support  for  power 
producing  companies  based  on  renewable  energy  sources.  The 
overall  csEMS  architecture  aims  at  high  scalability,  flexibility 
and  customization  with  management  capabilities  that  enable 
the operation of control centers independently of the underlying
application (e.g., wind, solar, etc.).


Fig.  1.  Architecture  of  a  DCAS-based  system. 

1) Overview of the Architecture: The basic building blocks
in a DCAS-based system (Figure 1) are:2

1 http://solutions.criticalsoftware.com/products_services/csems/

2 We herein consider a simplified version of the DCAS architecture. Further
details about DCAS can be found in [5].


•  Devices  are  equipped  with  one  or  more  sensors  to  obtain 
data  from  the  application  domain  (e.g.,  from  wind  towers, 
solar  panels,  etc.).  Each  sensor  has  an  associated  data 
stream  from  which  data  can  be  read.  Each  type  of  device 
has  its  particular  characteristics  (e.g.,  data  polling  rate,  or 
expected  value  ranges)  specified  in  a  device  profile. 

•  Processor  nodes  pull  data  from  the  devices  at  a  rate 
configured  in  the  device  profile,  and  dispatch  this  data 
to  the  database  server.  Each  processor  node  includes  a 
set  of  processes  called  Data  Requester  Processor  Pollers 
(DRPPs  or  pollers,  for  short)  responsible  for  retrieving 
data  from  the  devices.  Communication  between  DRPPs 
and devices is synchronous, so DRPPs remain blocked
until devices respond to data requests or a timeout expires.
This is the main performance bottleneck of DCAS.

•  Database  server  stores  the  data  collected  from  devices 
by  processor  nodes. 

•  Application  server  is  connected  to  the  database  server 
to  obtain  data,  which  can  be  presented  to  the  operators 
of  the  system  or  processed  automatically  by  application 
software.  However,  DCAS  is  application-agnostic,  so  the 
application  server  will  not  be  discussed  in  the  remainder 
of  this  paper. 

The  main  objective  of  DCAS  is  collecting  data  from  the 
connected  devices  at  a  rate  as  close  as  possible  to  the  one 
configured  in  their  device  profiles,  while  making  an  efficient 
use  of  the  computational  resources  in  the  processor  nodes. 
Specifically,  the  primary  concern  in  DCAS  is  providing  service 
while  maintaining  acceptable  levels  of  performance,  measured 
in  terms  of  processed  data  requests  per  second  (rps)  inserted 
in  the  database,  while  the  secondary  concern  is  optimizing  the 
cost  of  operating  the  system,  which  is  mapped  to  the  number 
of  active  processor  nodes  (i.e.,  each  active  processor  node  has 
a  fixed  operation  cost  per  time  unit). 

2) Adaptation Mechanisms: DCAS implements three adaptation
mechanisms to keep an acceptable level of performance
while making efficient use of computational resources:

•  Rescheduling  aims  at  avoiding  performance  degradation 
caused  by  devices  that  fail  to  respond  in  a  timely  manner 
when  polled.  It  consists  in  decreasing  the  priority  of  the 
data  streams  associated  with  the  failing  devices,  so  that 
they  are  polled  less  often  (thus  reducing  the  average  time 
that  DRPPs  remain  blocked  waiting  for  device  data). 

• Scale up aims at improving performance by exploiting CPU
and memory in active processor nodes as much as
possible, (de)activating DRPPs as required.

•  Scale  out  aims  at  maintaining  an  acceptable  performance 
level  when  new  devices  are  connected  to  the  network 
at  runtime  and  all  available  resources  in  the  set  of 
active  processor  nodes  are  already  being  used.  This  is 
achieved  by  dynamically  activating  new  processor  nodes, 
according  to  the  demand  determined  by  the  new  system’s 
workload  and  operating  conditions.  Scale  out  can  be 
performed  in  two  different  ways: 


-  As  a  manual  process  carried  out  by  a  human  operator. 
This  is  a  slow  and  demanding  process,  in  which  a 
new  processor  node  must  be  manually  deployed,  and 
devices  re-attached  across  the  different  already  active 
processor  nodes,  according  to  the  particular  situation. 
Specifically,  the  process  followed  by  a  human  operator 
to  perform  scale  out  consists  of  the  following  steps: 

1)  determine  which  (possibly  new)  devices  need  to  be 
attached  to  a  processor  node, 

2)  decide  which  of  those  devices  can  be  attached  to 
a  currently  active  processor  node,  and  which  must 
be  attached  to  a  new  one, 

3) restart active processor nodes that have been assigned
new devices, and

4)  deploy  and  activate  new  processor  node(s). 

- As an automated process that can be executed by
adaptation logic residing in a closed control loop.3
Although this process is faster than the manual version
of scale out and does not require any human intervention,
it is less effective in terms of exploiting system
resources,  since  only  new  devices  can  be  attached  to 
the  new  (pre-deployed)  processor  nodes  being  activated 
(i.e.,  devices  already  attached  to  other  processor  nodes 
cannot  be  re-attached,  restricting  the  space  of  target 
configurations  that  the  system  can  adopt  with  respect 
to  the  manual  variant  of  scale  out). 

We  illustrate  our  approach  by  focusing  exclusively  on  the 
trade-offs  of  employing  the  two  variants  of  scale  out,  since 
scale  up  and  rescheduling  are  fully  automated  adaptations 
carried  out  locally  on  each  processor  node,  according  to  the 
changing  conditions  of  the  devices  that  they  are  attached  to. 
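
The difference between the two scale out variants can be made concrete with a small capacity calculation. The following Python sketch is illustrative only: it assumes that the number of attached devices per node is the sole constraint, and the function names and example figures are hypothetical rather than taken from DCAS.

```python
import math


def new_nodes_automated(new_devices: int, max_devs_per_node: int) -> int:
    """Automated scale out: new devices may only go to newly activated nodes."""
    return math.ceil(new_devices / max_devs_per_node)


def new_nodes_manual(new_devices: int, attached: list[int], max_devs_per_node: int) -> int:
    """Manual scale out: the operator may first use spare capacity in active nodes."""
    spare = sum(max_devs_per_node - a for a in attached)
    remaining = max(0, new_devices - spare)
    return math.ceil(remaining / max_devs_per_node)


# Hypothetical situation: two active nodes holding 7 and 8 devices,
# a capacity of 10 devices per node, and 12 newly connected devices.
automated = new_nodes_automated(12, 10)
manual = new_nodes_manual(12, [7, 8], 10)
print(automated, manual)
```

Under these simplifying assumptions the manual variant can require fewer additional processor nodes, at the price of the operator's time and the latency of the manual process.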

III.  Related  Work 

Deciding whether humans should be involved in the execution
of adaptation is no easy task, since their behavior and the
outcome  of  their  actions  can  be  affected  by  transient  factors 
such  as  changing  levels  of  attention  and  load,  fixed  attributes 
(e.g.,  level  of  expertise  in  carrying  out  a  particular  task),  or 
even  their  physical  context  (e.g.,  access  to  different  locations, 
timing  issues).  These  factors  constitute  an  additional  source  of 
uncertainty  affecting  the  self-adaptive  system  (acknowledged 
by  Esfahani  and  Malek  as  uncertainty  due  to  human  in  the 
loop  [16])  that  needs  to  be  managed  if  we  want  to  answer  the 
following  questions: 

Q1: How can the outcome of adaptation be predicted if human
actors  are  involved  in  its  execution? 

Q2:  How  can  it  be  determined  whether  human  actors  should 
be  involved  in  adaptation? 

While answering Q1 calls for employing models of human
characteristics sufficient for representing the probabilistic nature
of human behavior and its interaction with the system,
Q2 also demands exploring mechanisms suitable to compare
candidate solutions that might include human-system collaborations,
as well as fully automated adaptations.

3 In this paper, we refer to an automated version of scale out implemented
on a Rainbow-based prototype of DCAS [4].

Some existing approaches in self-adaptation that automatically
generate adaptation plans at runtime are able to rank
candidate solutions by analyzing trade-offs among different
qualities [26] or consider uncertainty when tuning the operation
of the system (e.g., by dynamically adjusting parameters
[3], [15]). However, to the best of our knowledge, there are
no approaches able to rank candidate solutions by factoring
in uncertainty, rendering these approaches insufficient for
generating adaptation plans involving humans.

Other  approaches  in  self-adaptation  that  rely  on  selection 
of  adaptation  strategies  defined  by  a  designer  at  development 
time [19], [25] are also able to rank candidate solutions by analyzing
trade-offs among different quality concerns. Moreover,
these  approaches  are  sometimes  able  to  deal  with  some  aspects 
of  uncertainty  and  timing  [19].  However,  these  proposals  are 
limited  to  ranking  and  selection  of  fully  automated  adaptations, 
since  the  knowledge  models  they  employ  are  unable  to  capture 
the  multiple  facets  of  uncertainty  derived  from  human  behavior 
that  affect  the  outcome  of  adaptations. 

While  all  the  aforementioned  approaches  focus  exclusively 
on fully automated adaptations, Dorn and Taylor [14] introduce
a framework that enables a system adaptation manager to
reason  about  the  effects  of  software-level  changes  on  human 
interactions  and  vice-versa  by  mapping  a  model  of  what  they 
describe  as  human  architecture  (described  in  a  language  called 
hADL)  to  a  model  of  the  system’s  architecture  updated  at 
runtime.  This  approach  focuses  on  the  collaboration  topology 
and  is  able  to  rank  collaboration-(un)aware  adaptations  to 
select  the  best  course  of  action,  although  it  does  not  explicitly 
consider  uncertainty  derived  from  human  behavior  as  a  major 
factor  affecting  the  outcome  of  adaptations. 

Outside of the scope of self-adaptive systems, some approaches
in the business process modeling domain include
some aspects of human involvement, providing constructs for
describing human activities in business processes and their
dependencies [27], [13]. These languages target primarily
service-oriented architectures and have limited or no support
for other common architectural styles.

Eskins  and  Sanders  [17]  introduce  a  definition  of  a 
cyber-human  system  (CHS)  and  the  opportunity-willingness- 
capability  (OWC)  ontology  for  classifying  CHS  elements  with 
respect  to  system  tasks.  This  approach  provides  a  structured 
and  quantitative  means  of  analyzing  cyber  security  problems 
whose  outcomes  are  influenced  by  human-system  interactions, 
reflecting  the  probabilistic  nature  of  human  behavior. 

If we contrast questions Q1 and Q2 with the characteristics
of  the  related  approaches  described  in  this  section,  we  can  list 
a  set  of  requirements  that  a  suitable  approach  to  our  problem 
should  satisfy: 

R1: The approach must include a value system that enables
candidate solution ranking, allowing context-sensitive
adaptation.

R2: The approach must be able to consider uncertainty as a
primary factor that conditions the effectiveness of tasks
or adaptations.

R3: The approach must consider timing delays that capture
the notion of task or adaptation latency.

R4: The approach must be able to represent and enable the
analysis of human participant behavior.

R5: The approach must provide support for a variety of
architectural styles.

TABLE I

Requirements satisfied by related approaches.

Area                         Approach                 Requirements satisfied (out of R1-R5)
Self-adaptive Systems        Sykes et al. [26]        2
                             Calinescu et al. [3]     2
                             Epifani et al. [15]      2
                             Cheng et al. [19]        4
                             Oreizy et al. [25]       2
                             Dorn & Taylor [14]       3
Business Process Modelling   BPMN [27]                2
                             WS-BPEL4People [13]      2
Cyber-Human Systems          Eskins & Sanders [17]    2

Although the approaches described in this section partially
satisfy these requirements (see Table I), in this paper we
propose an approach that combines the strengths of the
Rainbow framework for self-adaptation [19] and the OWC
ontology described in [17]. On the one hand, Rainbow includes
a  value  system  based  on  utility  to  rank  candidate  adaptations, 
explicit  time  delays  to  observe  the  effects  of  adaptation  actions 
executed  upon  the  target  system,  and  is  able  to  account  for 
uncertainty  in  the  selection  of  adaptive  actions.  On  the  other 
hand,  OWC  models  provide  the  concepts  required  to  capture 
human  factors  that  can  influence  adaptation,  some  of  which 
are  of  a  probabilistic  nature. 

In  previous  work  [7],  we  presented  an  analysis  technique 
based  on  model  checking  of  SMGs  to  quantify  the  potential 
benefits of employing different types of algorithms for
self-adaptation. Specifically, the paper shows how the technique
enables  the  comparison  of  alternatives  that  consider  tactic 
latency  information  for  proactive  adaptation  with  those  that 
are  not  latency-aware.  In  this  paper,  we  apply  this  analysis 
technique  to  the  context  of  human-in-the-loop  adaptation, 
extending  SMG  models  with  elements  that  encode  an  extended 
version  of  Stitch  adaptation  models  with  OWC  constructs. 

IV.  Human-in-the-loop  Adaptation  in  Rainbow 

In  this  section,  we  first  introduce  the  main  concepts  behind 
the  Stitch  language  for  self-adaptation,  illustrating  them  with 
some  examples  in  DCAS.  Then,  we  present  a  candidate  model 
for  quantifying  how  human  involvement  in  the  execution  of 
adaptation  can  affect  system  objectives.  This  model  is  inspired 
by  the  OWC  ontology  described  in  [17].  Finally,  we  describe 
how  the  concepts  behind  Stitch  and  the  proposed  OWC  model 
can  be  combined  to  capture  descriptions  of  adaptations  that 
involve  collaborations  among  the  system  and  human  actors. 


A.  Adaptation  Model 

Although  many  proposals  that  rely  on  closed-loop  control 
exploit  architectural  models  for  adaptation  [19],  [22],  [25],  in 
this  paper  we  use  some  of  the  high-level  concepts  in  Stitch  [12] 
as  a  reference  framework  to  illustrate  our  approach.  Stitch 
is  the  language  employed  by  the  Rainbow  framework  [19] 
to describe automated repairs based on an architectural description
of the underlying target system. Rainbow has among
its  distinct  features  an  explicit  architecture  model  of  the 
target  system,  a  collection  of  adaptation  tactics,  and  utility 
preferences  to  guide  adaptation. 

We  assume  a  model  of  adaptation  that  represents  adaptation 
knowledge  using  the  following  high-level  concepts:4 

1) Tactic: is a primitive action that corresponds to a single
step  of  adaptation.  For  instance,  in  DCAS  we  can  specify  pairs 
of  tactics  with  opposing  effects  for  (de)activating  processor 
nodes,  or  scale  up/down  the  number  of  DRPPs  active  in  a 
given  processor  node.  Listing  1  shows  an  example  tactic  for 
activating  a  processor  node  in  DCAS.  Line  7  specifies  the 
applicability  condition,  which  says  that  the  tactic  may  be 
chosen  if  there  is  a  processor  node  available  not  currently 
active  (predicate  defined  in  line  1).  Lines  8-17  specify  the 
action,  which  is  to  select  a  single  processor  node  among 
those  currently  inactive  (line  9),  select  a  set  of  devices  of 
size  MAX_DEVS_PN  (a  constant  that  represents  the  maximum 
number  of  devices  that  can  be  attached  to  a  processor  node) 
among  those  that  are  currently  not  assigned  to  a  processor 
node  (line  10).  Next,  the  selected  devices  are  assigned  to  the 
processor  node  selected  (lines  12-13),  and  finally  the  operation 
to  enable  the  processor  node  is  called  (line  15).  Line  18  states 
that  the  tactic  succeeds  only  if  all  devices  are  attached  to  some 
processor  node.  Note  that  if  the  tactic  starts  with  set.Size(UDevs) 
>  M.MAX_DEVS_PN,  the  tactic  will  not  be  able  to  produce 
its  intended  effect  because  the  selection  of  a  subset  of  size 
M.MAX_DEVS_PN  is  done  only  once  (line  10).  This  failure  exit 
status  for  the  tactic  is  useful  as  feedback  indicating  that  more 
processor  nodes  may  need  to  be  activated. 
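
To make the condition, action and effect structure of the tactic easier to follow, the sketch below mimics it in Python. It is not Stitch and not part of Rainbow: the classes, the function `activate_pn` and the capacity value `MAX_DEVS_PN = 2` are hypothetical stand-ins chosen for illustration.

```python
import random
from dataclasses import dataclass

MAX_DEVS_PN = 2  # hypothetical capacity, chosen only for this example


@dataclass
class ProcessorNode:
    id: int
    is_active: bool = False


@dataclass
class Device:
    id: int
    location: int = -1  # -1 means "not attached to any processor node"


def unassigned(devices):
    """Counterpart of UDevs: devices without a processor node."""
    return [d for d in devices if d.location == -1]


def activate_pn(nodes, devices, rng):
    """Return True if the tactic's effect holds and False if it does not.

    Returns None when the applicability condition is not met.
    """
    inactive = [n for n in nodes if not n.is_active]
    if not inactive:                       # condition: an inactive node exists
        return None
    node = rng.choice(inactive)            # select a single inactive node
    udevs = unassigned(devices)
    chosen = rng.sample(udevs, min(MAX_DEVS_PN, len(udevs)))  # selected only once
    for d in chosen:
        d.location = node.id               # assign each selected device
    node.is_active = True                  # enable the processor node
    return not unassigned(devices)         # effect: no unprocessed devices remain


rng = random.Random(0)
nodes = [ProcessorNode(1), ProcessorNode(2)]
devices = [Device(i) for i in range(3)]    # three unassigned devices > MAX_DEVS_PN
first = activate_pn(nodes, devices, rng)   # one device is left over
second = activate_pn(nodes, devices, rng)  # a second activation attaches it
print(first, second)
```

In this toy run the first call does not achieve its effect because more devices are waiting than a single node can take, which corresponds to the failure status discussed above; a second activation then succeeds.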

Tactics have an associated cost/benefit impact on the different
dimensions of concern in the system. Table II shows
the different tactics in DCAS, and their impacts on quality
dimensions: activatePN is an automated tactic that activates a
previously deployed processor node, reducing response time
and increasing operation cost, whereas addPN is carried out
by a human actor and also has a beneficial impact on performance
and a detrimental impact on cost. Although the impact on cost
is the same as in activatePN, the impact on performance will
typically be higher, since a human operator is less constrained
than the automated approach and can exploit the available processor
nodes by re-attaching devices properly. However, the downside
of tactic addPN is that it typically has a much higher latency
(i.e., the time that elapses from when a tactic is triggered until its effects
can be observed) than activatePN. Note that, to obtain both
the latency and impact on the different quality dimensions
of tactics in practice, the approach relies on expert knowledge
or field data about similar existing systems, although
nothing prevents the use of machine learning techniques to
obtain that information. Although in this paper we consider
fixed cost/benefit impacts for illustration purposes, Stitch also
supports the specification of sophisticated impact models that
are context-sensitive, and can capture probabilistic aspects in
the outcome of tactic executions [6].

4 We use a simplified version of Stitch [12] to illustrate the main ideas in
this paper.


1  define boolean PNA=exists n:ProcessorNodeT in M.components |
       !n.isActive;
2  define set ActivePNs={select n:ProcessorNodeT in M.components |
       !n.isActive};
3  define set UDevs={select d:DeviceT in M.components | d.location==-1};
4  define boolean unprocessedDevices=set.Size(UDevs)>0;
5  ...
6  tactic activatePN () {
7    condition {PNA}
8    action {
9      set pn = set.RandomSubset(ActivePNs, 1);
10     set devs = set.RandomSubset(UDevs, M.MAX_DEVS_PN);
11     for (ProcessorNodeT n : pn) {
12       for (DeviceT d : devs) {
13         M.assignDeviceToPN(d, n.id);
14       }
15       M.enablePN(n, true);
16     }
17   }
18   effect {!unprocessedDevices;}
19 }


Listing 1. Tactic for activating a processor node in DCAS.


TABLE II

Cost/benefit for DCAS tactics.

             Performance (P)                        Cost (C)
Tactic       Δ Requests Per Second (rps)   ΔU_P     Δ Operating Cost (usd/hr)   ΔU_C
activatePN   +80                           ↑        +1.0                        ↓
addPN        +150                          ↑↑       +1.0                        ↓

2) Strategy: encapsulates an adaptation process, where each
step  is  the  conditional  execution  of  a  tactic.  Strategies  are 
characterized  in  Stitch  as  a  tree  of  condition-action-delay 
decision  nodes,  where  delays  correspond  to  a  time  window 
for  observing  tactic  effects.  System  feedback  (through  the 
dynamically-updated  architectural  model  of  the  system)  is 
used  to  determine  the  next  tactic  at  every  step  during  strategy 
execution. 


1  strategy ScaleOut
2  [unprocessedDevices] {
3    t0a: (maxAssignedDevices) -> activatePN() @[10000] {
4      t0aa: (success) -> done;
5      t0ab: (fail) -> TNULL;
6    }
7    t0b: (fail) -> TNULL;
8  }


Listing 2. Strategy for activating a processor node automatically.


Listing 2 shows the Stitch code for a simple scale out
adaptation strategy in DCAS: line 2 specifies the applicability
condition that needs to be satisfied for the strategy to be
eligible for execution (in this case, predicate unprocessedDevices
indicates that there are new devices in the network that
need to be attached to a processor node). In the body of the
strategy, node t0a (line 3) executes tactic activatePN if the guard
maxAssignedDevices evaluates to true (i.e., if the active
processor nodes are already attached to the maximum number
of devices they support). To account for the delay in observing
the outcome of tactic execution in the system, t0a specifies a
time window of 10 seconds (10000 ms in the listing). If the tactic's intended
effect (as defined in the tactic script, Listing 1, line 18) is
observed by the end of this window, successful tactic completion (keyword success, line
4) leads to the end of strategy execution in normal conditions
through node t0aa (keyword done). Otherwise, if the intended
tactic effect is not observed after the delay window expires
(keyword fail, line 5), the strategy exits with an error status
via node t0ab. An additional condition-action-delay node (t0b,
line 7) also exits the strategy with an error status if the guard
of node t0a is false, and tactic activatePN cannot be executed
in the first place.
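
The control flow of this strategy can be paraphrased in a few lines of Python. This is only a sketch of the condition-action-delay pattern, not Rainbow's strategy executor: the function `run_scale_out`, its parameters, and the returned status strings are hypothetical, and the delay is injected as a callable so that the example runs without actually waiting.

```python
from typing import Callable


def run_scale_out(
    applicable: Callable[[], bool],       # strategy condition: unprocessedDevices
    guard: Callable[[], bool],            # node guard: maxAssignedDevices
    tactic: Callable[[], None],           # the tactic to execute (activatePN)
    effect_observed: Callable[[], bool],  # tactic effect: no unprocessed devices
    wait: Callable[[float], None],        # injected delay, e.g. time.sleep
    window_s: float = 10.0,
) -> str:
    if not applicable():
        return "not applicable"
    if guard():
        tactic()
        wait(window_s)                    # time window for observing the effect
        return "done" if effect_observed() else "TNULL"
    return "TNULL"                        # guard false: exit with error status


# Toy usage with an in-memory state instead of a real system model.
state = {"unprocessed": 2, "waited": 0.0}


def toy_tactic() -> None:
    state["unprocessed"] = 0


def toy_wait(seconds: float) -> None:
    state["waited"] += seconds            # record the delay instead of sleeping


outcome = run_scale_out(
    applicable=lambda: state["unprocessed"] > 0,
    guard=lambda: True,
    tactic=toy_tactic,
    effect_observed=lambda: state["unprocessed"] == 0,
    wait=toy_wait,
)
print(outcome)
```

Passing the delay as a parameter is a design choice made for this sketch, so that the observation window can be simulated in tests.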

3) Utility Profile: to enable the selection of strategies
at runtime, we assume that adaptation is driven by utility
functions and preferences, which are sensitive to the context
of use and able to consider trade-offs among multiple
potentially conflicting objectives. The different qualities of
concern are characterized as utility functions that map them to
architectural properties. Table III summarizes an example
of utility functions for DCAS. Function $U_P$ maps high levels of
processed requests per second inserted in the database ($rps$) to
high utility by dividing it by the maximum level of achievable
rps in the system, $rps_{max}$, which is computed according to
the number of devices in the system and the data polling rates
configured in their device profiles.

On the contrary, function $U_C$ maps higher costs (derived
from the number of active processor nodes, $pn$) to lower
utility values. Cost utility becomes 0 when $pn$ reaches the
maximum number of available nodes, $pn_{max}$. Utility preferences
capture business preferences over the quality dimensions,
assigning a weight to each one of them (e.g., weights
$w_{U_P} = 0.8$ and $w_{U_C} = 0.2$ indicate that performance is the
main concern in the system).

TABLE III

Utility functions for DCAS.

Performance (P)                        Cost (C)
$U_P(rps) = \frac{rps}{rps_{max}}$     $U_C(pn) = 1 - \frac{pn}{pn_{max}}$
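
As a worked illustration of these utility functions, the Python sketch below evaluates them for one configuration. Combining the two dimensions as a weighted sum is an assumption made here for illustration, and the input values (400 rps out of 1000, 2 of 5 nodes active) are hypothetical.

```python
def u_p(rps: float, rps_max: float) -> float:
    """Performance utility: fraction of the maximum achievable rps."""
    return rps / rps_max


def u_c(pn: int, pn_max: int) -> float:
    """Cost utility: drops to 0 when all available nodes are active."""
    return 1 - pn / pn_max


def overall_utility(rps: float, rps_max: float, pn: int, pn_max: int,
                    w_p: float = 0.8, w_c: float = 0.2) -> float:
    """Weighted sum of the two utilities (aggregation assumed for this sketch)."""
    return w_p * u_p(rps, rps_max) + w_c * u_c(pn, pn_max)


# Hypothetical operating point: 400 of 1000 rps, 2 of 5 nodes active.
total = overall_utility(rps=400, rps_max=1000, pn=2, pn_max=5)
print(round(total, 2))
```

With these inputs, performance utility is 0.4 and cost utility is 0.6, so the weighted total is approximately 0.8 · 0.4 + 0.2 · 0.6 = 0.44.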

B.  Human  Model 

Attributes of human actors that might affect interactions
with the system are captured in a model inspired by an
opportunity-willingness-capability (OWC) ontology described
in the context of cyber-human systems [17]. These models
extend the description of the underlying system's architecture,
and can incorporate multiple human actor types (e.g., human
actor roles specialized in different tasks), each of which can
have multiple instances (e.g., operators with different levels of
training in a particular task). In particular, attributes of human
actor types can be categorized into:

1) Opportunity: captures the applicability conditions of the
adaptation tactics that can be executed by human actors upon
the target system, as constraints imposed on the human actor
(e.g., by the physical context - is there an operator physically
located on site?).

Example 1: We consider a tactic to have a human actor
manually deploy a processor node (addPN) when performing
scale out in DCAS. Opportunity elements are $OE_{addPN} =
\{L, B\}$, where L represents the operator's location, and B
tells us whether the operator is busy doing something else:

• $L.state \in \{$operator on location (ONL), operator off
location (OFFL)$\}$.

• $B.state \in \{$operator busy (OB), operator not busy
(ONB)$\}$.

Using $OE_{addPN}$, we can define an opportunity function for
the tactic, $f^O_{addPN} = (L.state == ONL) \cdot (B.state == ONB)$,
that can be used to constrain its applicability only to situations
in which there is an operator on location who is not busy.
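As a minimal sketch of the opportunity function in Example 1, the product of the two conditions can be read as a 0/1 indicator. The enum and function names below are hypothetical and chosen only for illustration; they are not part of the Stitch or PRISM encodings.

```python
from enum import Enum


class Location(Enum):
    ONL = "operator on location"
    OFFL = "operator off location"


class Busy(Enum):
    OB = "operator busy"
    ONB = "operator not busy"


def opportunity_add_pn(location: Location, busy: Busy) -> int:
    """(L.state == ONL) * (B.state == ONB), evaluated as a 0/1 indicator."""
    return int(location is Location.ONL) * int(busy is Busy.ONB)


# Enumerate the four combinations and keep those where addPN is applicable.
applicable = [(loc.name, b.name) for loc in Location for b in Busy
              if opportunity_add_pn(loc, b) == 1]
print(applicable)  # [('ONL', 'ONB')]
```

Only one of the four combinations enables the tactic, which matches the constraint that an operator must be on location and not busy.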

2) Willingness: captures transient factors that might affect
the disposition of the operator to carry out a particular task
(e.g., load, stamina, stress). Continuing with our example,
willingness elements in the case of the addPN tactic can be
defined as $WE_{addPN} = \{S\}$, where $S.state \in [0, 10]$ represents
the operator's stress level. A willingness function mapping
willingness elements to a probability of tactic completion can
be defined as $f^W_{addPN} = pr_W(S.state)$, with $pr_W : S \to [0, 1]$.

3) Capability: captures the likelihood of successfully
carrying out a particular task, which is determined by fixed
attributes of the human actor, such as training level. In our
example, we define capability elements as $CE_{addPN} = \{T\}$,
where T represents the operator's level of training (e.g.,
$T.state \in [0, 1]$). We define a capability function that maps
training level to a probability of successful tactic performance
as $f^C_{addPN} = pr_C(T.state)$, with $pr_C : T \to [0, 1]$.

C.  Integrating  Human  and  Adaptation  Models 

Incorporation  of  OWC  elements  for  adaptation  execution 
in  Stitch  affects  the  specification  of  different  elements  in 
adaptation  tactics  and  strategies. 

1) Tactics: In tactics involving humans, constraints that
affect the applicability of a tactic can be derived either from
the human model (opportunity elements), or properties of the
architecture itself (e.g., are there any available processor nodes
to deploy?). In general, applicability conditions of these tactics
will be a combination of both. In Listing 3, we can observe
how the condition block of tactic addPN (line 5) combines
opportunity elements from the human model (operator on
location and not busy - predicate ONLNB, defined in line 1),
with a predicate defined over the properties of the architecture
(processor node available - PNA, defined in line 2).

The action block of these tactics can execute automated
operations, as in the case of tactic activatePN (Listing 1), but
also notify human actors to perform a task. The action block
of tactic addPN (Listing 3, lines 6-8) first selects an available
processor node (line 6), and then an available operator on
location (line 7). Finally, it notifies the selected operator, via
a text message communicating the node's id, that she has to
activate the previously selected processor node (line 8).


1  define boolean ONLNB=exists o:operatorT in M.participants | o.onLocation
     && !o.busy;
2  define boolean PNA=exists n:ProcessorNodeT in M.components |
     !n.isActive;
3  ...
4  tactic addPN(){
5    condition {ONLNB && PNA;}
6    action {pn=Set.RandomSubSet(ActivePNs,1);
7      ao=Set.RandomSubSet({select o:operatorT in M.participants
         | o.onLocation && !o.busy},1);
8      notify(ao, "Deploy processor node: %s", pn.id);}
9    effect {!unprocessedDevices;}
10 }

Listing 3. Tactic for adding a processor node via human operator.


2) Strategies: Fully automated tactics and tactics involving
humans can be combined to achieve better outcomes in adaptation
strategies. Listing 4 shows strategy ScaleOutOp for scaling
out the system, first by notifying an operator (via tactic addPN,
line 3) to manually deploy and activate a processor node. If the
assigned time window of 50 seconds expires and the intended
effect of the tactic (!unprocessedDevices) is not observed, the
strategy executes the automated activatePN tactic as a fallback
mechanism (line 4).


1  strategy ScaleOutOp
2  [unprocessedDevices] {
3    t0: (maxAssignedDevices) -> addPN()@[50000] { // add PN, wait 50s
4      t0a: (fail) -> activatePN()@[10000] { // If failed, activate PN
5        t0aa: (success) -> done;
6        t0ab: (fail) -> TNULL;
7      }
8      t0b: (fail) -> TNULL;
9    }
10 }

Listing 4. Strategy combining automated/manual tactics for activating
processor node.

V. Reasoning about Human-in-the-loop
Adaptation

When  defining  a  collection  of  adaptation  strategies  and 
their  associated  utility  profile,  we  need  to  guarantee  not  only 
that  the  system  will  carry  out  reasonable  choices  under  all 
possible  circumstances,  but  also  that  the  effect  of  those  choices 
combined  with  the  behavior  of  human  participants  will  have 
a  reasonable  impact  on  business  concerns.  To  provide  such 
guarantees,  we  make  use  of  a  formal  model  based  on  an 
abstraction  of  the  extended  Stitch  profile  for  human-in-the- 
loop  adaptation  presented  in  Section  IV  that  enables  us  to 
reason  about:  (i)  the  choices  made  by  the  adaptation  manager 
for  adaptation  strategy  selection,  and  (ii)  the  impact  of  the 
execution  of  selected  adaptation  strategies  on  the  target  system. 

Our  modeling  approach  for  human-in-the-loop  adaptation 
consists  in  describing  a  stochastic  multiplayer  game  in  which 
we  consider  that  one  of  the  players  is  the  adaptive  system 
(including  both  automatic  mechanisms  and  human  actors) 
and  the  other  is  the  environment  within  which  the  system 
operates.  The  goal  of  the  system  player  is  to  maximize  accrued 
utility  during  the  system’s  execution  (encoded  formally  as  a 
reward  structure),  while  we  consider  the  environment  to  be  an 
antagonistic  player  that  tries  to  minimize  that  same  reward. 


In  the  remainder  of  this  section,  we  first  introduce  some 
background  on  model  checking  SMGs,  the  formal  technique 
that  we  use  to  formally  reason  about  human  involvement 
in  adaptation.  Next,  we  provide  a  description  of  our  DCAS 
SMG  model  implemented  in  the  probabilistic  model-checker 
PRISM-games  [9],  as  well  as  the  analysis  and  results  that  we 
obtained  for  human-in-the-loop  adaptation  in  the  context  of 
DCAS  scale  out. 

A.  Model  Checking  Stochastic  Multiplayer  Games 

Automatic verification techniques for probabilistic systems
have been successfully applied in a variety of application
domains that range from power management or wireless
communication protocols, to biological systems. In particular,
techniques such as probabilistic model checking provide a
means to model and analyze systems that exhibit stochastic
behavior, effectively enabling quantitative reasoning about
probability and reward-based properties (e.g., about the
system's use of resources, or time).

Competitive behavior may also appear in (stochastic) systems
when some component cannot be controlled, and could
behave according to different or even conflicting goals with
respect to other components in the system. In such situations,
a natural fit is modeling a system as a game between different
players, adopting a game-theoretic perspective. Automatic
verification techniques have been successfully used in this
context, for instance for the analysis of security [23] or
communication protocols [20].

Our approach to analyzing human involvement in adaptation
builds upon a recent technique for modeling and analyzing
SMGs [8]. In this approach, systems are modeled as turn-based
SMGs, meaning that in each state of the model, only
one player can choose between several actions, the outcome
of which can be probabilistic. Players can either cooperate to
achieve the same goal, or compete to achieve their own goals.

The approach includes a logic called rPATL for expressing
quantitative properties of stochastic multiplayer games, which
extends the probabilistic logic PATL [10]. PATL is itself an
extension of ATL [1], a logic extensively used in multiplayer
games and multiagent systems to reason about the ability
of a set of players to collectively achieve a particular goal.
Properties written in rPATL can state that a coalition of players
has a strategy⁵ which can ensure that either the probability of
an event's occurrence or an expected reward measure meets
some threshold.

rPATL is a CTL-style branching-time temporal logic that
incorporates the coalition operator $\langle\langle C \rangle\rangle$ of ATL, combining
it with the probabilistic operator $P_{\bowtie q}$ and path formulae from
PCTL [2]. Moreover, rPATL includes a generalization of the
reward operator $R^r_{\bowtie x}$ from [18] to reason about goals related
to rewards. An extended version of the rPATL reward operator
$\langle\langle C \rangle\rangle R^r_{max=?}[F^\star \phi]$⁶ enables the quantification of the
maximum accrued reward r along paths that lead to states satisfying
state formula $\phi$ that can be guaranteed by players in coalition
C, independently of the strategies followed by the rest of the
players. An example of typical usage combining the coalition
and reward maximization operators is
$\langle\langle sys \rangle\rangle R^{utility}_{max=?}[F^c\ end]$,
meaning "value of the maximum utility reward accumulated
along paths leading to an end state that a player sys can
guarantee, regardless of the strategies of other players."

⁵ The term strategy employed in the context of SMGs refers to a game
strategy (referred to also as policy or adversary) as defined in [8], and should
not be confused with Stitch adaptation strategies.

Reasoning about strategies is a fundamental aspect of model
checking SMGs, which enables checking for the existence
of a strategy that is able to optimize an objective expressed
as a property including an extended version of the rPATL
reward operator. The checking of such properties also supports
strategy synthesis, enabling us to obtain the corresponding
optimal strategy. An SMG strategy resolves the choices in each
state, selecting actions for a player based on the current state
and a set of memory elements.⁷
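To make the notion of a guaranteed reward more concrete, the sketch below evaluates a tiny turn-based game given as a finite tree. It is a deliberately simplified illustration, not the algorithm implemented in PRISM-games: the tuple-based tree encoding is hypothetical, rewards are attached only to terminal states, and all numbers are invented for the example.

```python
from typing import Tuple, Union

# Hypothetical tree encoding (not PRISM-games syntax):
#   ("leaf", reward)                  terminal state with its accrued reward
#   ("sys", [child, ...])             system player picks the maximum
#   ("env", [child, ...])             environment picks the minimum
#   ("chance", [(prob, child), ...])  probabilistic outcome of an action
Node = Tuple[str, Union[float, list]]


def guaranteed_value(node: Node) -> float:
    """Expected reward the system can guarantee against any environment."""
    kind, payload = node
    if kind == "leaf":
        return float(payload)
    if kind == "sys":
        return max(guaranteed_value(child) for child in payload)
    if kind == "env":
        return min(guaranteed_value(child) for child in payload)
    if kind == "chance":
        return sum(prob * guaranteed_value(child) for prob, child in payload)
    raise ValueError(f"unknown node kind: {kind}")


# Illustrative numbers only: a risky action versus a safe one.
risky = ("chance", [(0.5, ("leaf", 140.0)),
                    (0.5, ("env", [("leaf", 100.0), ("leaf", 90.0)]))])
safe = ("leaf", 110.0)
game = ("sys", [risky, safe])
print(guaranteed_value(game))  # 0.5 * 140 + 0.5 * 90 = 115.0
```

The system prefers the risky branch here because its worst-case expected value (115) still exceeds the safe alternative (110); the choice made at each "sys" node is what strategy synthesis would report.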

B.  Formal  Model 

Our  game  is  played  in  turns  by  two  players  that  are  in 
control  of  the  behavior  of  the  environment  and  a  DCAS- 
based  system  (including  human  actors),  respectively.  The  SMG 
model  consists  of  the  following  parts: 

1) Player definition: Listing 5 illustrates player definition
in the SMG. Player env is in control of all the (asynchronous)
actions that the environment can take (defined in the environment
module), while system player sys controls all the actions
that belong to the human actor plus target system ensemble,
whose behavior is specified in the processes ha_system (target
system), as well as scaleOutOp and scaleOut (adaptation
strategies for manual and automatic scale out, respectively).
Moreover, the system player controls the synchronization of
actions between scale out adaptation strategies and target
system, which represent the addPN and activatePN adaptation
tactics. Global variable turn (line 4) is used to explicitly encode
alternating turns between the system and environment players.


1  player env environment endplayer
2  player sys ha_system, [addPN], [activatePN], scaleOutOp, scaleOut
     endplayer
3  const ENVT=0, SYST=1;
4  global turn:[ENVT..SYST] init SYST;

Listing 5. Player definition for the DCAS scale out SMG.

2) Environment: controls the evolution of variables in the
execution context that are out of the system's control. For the
sake of simplicity, we assume in our model a simple behavior
of the environment that only keeps track of time, although
additional behavior controlling other elements (e.g., network
delay) can be encoded (please refer to [7] for further details
illustrating the modeling of adversarial environment behavior
in turn-based SMGs).

⁶ The variants of $F^\star \phi$ used for reward measurement, in which the
parameter $\star \in \{0, \infty, c\}$, indicate that, when $\phi$ is not reached, the
reward is zero, infinite, or equal to the cumulated reward along the whole
path, respectively.

⁷ See [8] for more details on SMG strategy synthesis.


1  const MAX_TIME; // Execution time frame [0,MAX_TIME]
2  module environment
3    t:[0..MAX_TIME] init 0;
4    [] (turn=ENVT) & (t<MAX_TIME) -> (t'=t+1) & (turn'=SYST);
5  endmodule

Listing 6. Environment module.


In Listing 6, variable t (line 3) keeps track of execution time
(the time frame for the system's execution is determined by [0,
MAX_TIME]⁸). During its turn, the environment checks that the
end of the time frame for the execution has not been reached
yet, and if that is the case, it increments the value of t by one
unit, yielding the turn to the system player (line 4)⁹.

3) Human Model: Listing 7 shows the encoding of the
OWC elements corresponding to an operator in the DCAS
system. Opportunity elements (line 2) indicate whether the
operator is on location and/or busy. These predicates are used
to guard the execution of tactic addPN in the model (Listing 8,
line 9). The willingness function of the operator (line 6)
decreases linearly with the stress level of the operator,
declared in line 5. The capability function (line 9) penalizes
training levels below 0.4, whereas it rewards levels above 0.8.


1  // Opportunity elements
2  global op_onLocation: bool init true, op_busy: bool init false;
3  // Willingness elements and function
4  const MAX_STRESS_LEVEL, INIT_STRESS_LEVEL;
5  global op_stressLevel : [0..MAX_STRESS_LEVEL] init
     INIT_STRESS_LEVEL;
6  formula op_f_w=(MAX_STRESS_LEVEL-op_stressLevel) /
     MAX_STRESS_LEVEL;
7  // Capability elements and function
8  const double op_trainingLevel;
9  formula op_f_c= op_trainingLevel > 0.8 ? 1 : (op_trainingLevel<0.4 ?
     op_trainingLevel/3 : op_trainingLevel);
10 // Combined WC probability for tactic addPN
11 formula addPN_wc_prob = op_f_c * op_f_w;

Listing 7. Human actor model encoding for a DCAS operator.
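The formulas in Listing 7 can be mirrored in Python to check how the combined willingness-capability probability behaves at specific points. This is a sketch under assumptions: the function names are hypothetical, and the default maximum stress level of 10 follows the [0, 10] range used for stress elsewhere in the paper rather than a value fixed in the listing.

```python
def willingness(stress_level: int, max_stress_level: int = 10) -> float:
    """Mirror of op_f_w: 1 at zero stress, 0 at maximum stress."""
    return (max_stress_level - stress_level) / max_stress_level


def capability(training_level: float) -> float:
    """Mirror of op_f_c: boost above 0.8, penalty below 0.4."""
    if training_level > 0.8:
        return 1.0
    if training_level < 0.4:
        return training_level / 3
    return training_level


def add_pn_success_probability(stress_level: int, training_level: float,
                               max_stress_level: int = 10) -> float:
    """Mirror of addPN_wc_prob: product of capability and willingness."""
    return capability(training_level) * willingness(stress_level,
                                                    max_stress_level)


# Sample points (illustrative inputs, not results from the paper).
for stress, training in [(0, 0.9), (5, 0.5), (0, 0.3), (10, 1.0)]:
    print(stress, training, add_pn_success_probability(stress, training))
```

A well-trained operator under no stress completes the tactic with probability 1, while maximum stress drives the probability to 0 regardless of training.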

4) System: The combined behavior of the target system and
human actors is described in module ha_system (Listing 8).
The module incorporates a collection of variables encoding
the different system qualities of concern, as well as the aspects
relevant to the applicability conditions of tactics (e.g., values
of predicates used in the condition block of a tactic). Lines 2-6
illustrate how the different variables are initialized:

• rps and pn encode the performance and number of active
processor nodes in the DCAS system, respectively.

• addpn_fail is a flag that determines if the addition of a
processor node by a human actor failed.

• cnt_addPN and cnt_activatePN are counters used to keep
track of the latency of tactics addPN and activatePN,
respectively.

⁸ Constant values not defined in the model are provided as command-line
input parameters to the tool.

⁹ We illustrate our approach to modeling the SMG using the syntax of the
PRISM language [24] for Markov Decision Processes (MDPs), which are
encoded as commands:

[action] guard -> p_1 : u_1 + ... + p_n : u_n

where guard is a predicate over the model variables. Each update u_i
describes a transition that the process can make (by executing action) if the
guard is true. An update is specified by giving the new values of the variables,
and has an assigned probability $p_i \in [0, 1]$. Multiple commands with
overlapping guards (possibly including a single update of unspecified
probability) introduce local nondeterminism.

Moreover,  the  module  includes  commands  that  model  the 
effect  of  executing  the  different  tactics  as  updates  on  its 
variables.  In  particular,  there  are  three  different  commands  per 
tactic  in  the  module.  We  focus  on  tactic  addPN  to  illustrate  how 
tactic  execution  is  modeled: 

• Tactic trigger (line 9). Triggers tactic execution when:
(i) an operator is on location and not busy, (ii) the number
of active processor nodes is lower than the maximum
available, and (iii) the latency counter for the tactic is
zero. As a consequence, the operator is flagged as busy
and the latency counter is activated (cnt_addPN'=1).

• Tactic latency counter update (line 12). If the tactic
counter is active, but still has not reached the tactic's
latency value, the counter is incremented by one unit.

• Tactic completion (line 15). When the tactic's latency
counter expires, the command can either: (i) update
variables rps and pn according to a successful activation of
a processor node with probability addPN_wc_prob (determined
by the willingness and capability elements defined
in Listing 7, line 11), or (ii) fail to activate the node
with probability 1-addPN_wc_prob, flagging the failure on
variable addpn_fail. In both cases, the latency counter is
reset, and the busy status of the operator is set to false.
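The three commands described above (trigger, latency counter update, completion) can be read as a small state machine. The Python sketch below simulates one possible run of it; it is not a translation of the PRISM module. It assumes the operator is always on location, that the tactic is requested as soon as it is applicable, and that a successful completion adds exactly one processor node; the rps update is omitted and all names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class AddPNState:
    cnt: int = 0          # latency counter (cnt_addPN); 0 means inactive
    op_busy: bool = False
    pn: int = 1           # number of active processor nodes
    failed: bool = False  # mirrors the addpn_fail flag


def system_turn(state: AddPNState, latency: int, pn_max: int,
                success_prob: float, rng: random.Random) -> str:
    """Fire one addPN-related command and return a label for it."""
    if state.cnt == 0:
        # Trigger: operator free and a processor node still available.
        if not state.op_busy and state.pn < pn_max:
            state.cnt, state.op_busy = 1, True
            return "trigger"
        return "idle"
    if state.cnt < latency:
        state.cnt += 1  # latency period still running
        return "wait"
    # Completion: latency expired, so reset counter and free the operator.
    state.cnt, state.op_busy = 0, False
    if rng.random() < success_prob:
        state.pn += 1  # assumption: success adds exactly one node
        return "success"
    state.failed = True
    return "fail"


state = AddPNState()
rng = random.Random(0)
print([system_turn(state, latency=3, pn_max=4, success_prob=1.0, rng=rng)
       for _ in range(4)])
```

With a latency of 3, the tactic needs one trigger turn, two waiting turns, and a final completion turn, at which point the outcome is sampled with the willingness-capability probability.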

The  encoding  used  for  the  activatePN  tactic  (lines  10,13,18) 
follows  the  same  structure,  but  without  any  OWC  elements 
encoded  in  the  guards  or  updates  of  the  commands.  Note  that 
activation  of  a  processor  node  using  this  tactic  is  assumed  to 
be  always  successful. 

Every command in the module includes a predicate in the
guard to ensure that the command can only be triggered during
the system player's turn (turn=SYST), and an additional
predicate in the post state that yields the turn to the environment
player (turn'=ENVT). Moreover, an additional command (line
20) lets the process progress without any variable updates
when none of the latency periods for the tactics are active.
Note that in our model, we assume sequential execution of
tactics (in accordance with Stitch semantics).

5) Adaptation logic: Modules scaleOutOp and scaleOut
model the adaptation logic placed in the controller, according
to their respective Stitch strategies described
in Listings 4 and 2. Each of the commands corresponds to a
tactic that can be executed in the target system via
synchronization on shared action names with trigger commands in the
ha_system module (Listing 8, lines 9-10).

1  module ha_system
2    rps : [RPS_MIN..RPS_MAX] init INIT_RPS;
3    pn : [PN_MIN..PN_MAX] init INIT_PN;
4    addpn_fail : bool init false;
5    cnt_addPN : [0..addPN_settling] init 0;
6    cnt_activatePN : [0..activatePN_settling] init 0;
7
8    // Tactic triggers
9    [addPN] (turn=SYST) & (op_onLocation) & (!op_busy) & (pn<PN_MAX) &
       (cnt_addPN=0) -> (cnt_addPN'=1) & (op_busy'=true);
10   [activatePN] (turn=SYST) & (pn<PN_MAX) & (cnt_activatePN=0) ->
       (cnt_activatePN'=1) & (turn'=ENVT);
11   // Tactic latency counter update
12   [] (turn=SYST) & (cnt_addPN>0) & (cnt_addPN<addPN_latency) ->
       (cnt_addPN'=cnt_addPN+1) & (turn'=ENVT);
13   [] (turn=SYST) & (cnt_activatePN>0) &
       (cnt_activatePN<activatePN_latency) ->
       (cnt_activatePN'=cnt_activatePN+1) & (turn'=ENVT);
14   // Tactic completion (after latency period expires)
15   [] (turn=SYST) & (cnt_addPN=addPN_latency) ->
16     addPN_wc_prob : (cnt_addPN'=0) & (rps'=addpn_f_p) &
         (pn'=f_apn) & (op_busy'=false) & (turn'=ENVT)
17     + 1-addPN_wc_prob : (cnt_addPN'=0) & (addpn_fail'=true)
         & (op_busy'=false) & (turn'=ENVT);
18   [] (turn=SYST) & (cnt_activatePN=activatePN_latency) ->
       (rps'=activatepn_f_p) & (pn'=f_apn) & (cnt_activatePN'=0) &
       (turn'=ENVT);
19   // Do nothing
20   [] (turn=SYST) & (cnt_addPN=0) & (cnt_activatePN=0) -> (turn'=ENVT);
21 endmodule

Listing 8. Target system extended with human actors module.

Module scaleOutOp (Listing 9) models the variant of the
scale out mechanism that makes use of a human actor. The
command on line 3 encodes the triggering of tactic addPN¹⁰,
which sets the value of the timestamp variable addPN_trigger_t
that indicates at which time point the tactic was triggered.
This variable is used in the guard of the command encoding
the execution of activatePN (line 4) to determine whether the
settling time for the previous tactic has already expired. If this
is the case, and the activation of the new processor node by
the human operator is not successful (addpn_fail), the command
executes, triggering the activatePN tactic, consistently with the
Stitch code in Listing 4, line 4.

¹⁰ We abstract away predicates unprocessedDevices and
maxAssignedDevices (Listing 4, lines 2-3), which we assume to be true
in the scale out scenarios encoded in our model.


1  module scaleOutOp
2    addPN_trigger_t:[0..MAX_TIME] init 0;
3    [addPN] (turn=SYST) -> (addPN_trigger_t'=t);
4    [activatePN] (turn=SYST) & (t=addPN_trigger_t+addPN_settling) &
       (addpn_fail) -> true;
5  endmodule
6
7  module scaleOut
8    [activatePN] (turn=SYST) -> true;
9  endmodule

Listing 9. ScaleOutOp and scaleOut adaptation strategy modules.


Module  scaleOut  (Listing  9,  line  7)  models  the  automatic 
scale  out  mechanism  with  a  single  command  that  triggers  the 
tactic’s  execution  on  the  target  system. 

6) Utility profile: Utility functions and preferences are
encoded using formulas and reward structures that enable
the quantification of instantaneous utility (Listing 10). In
particular, formulas quantify utility for every dimension of
concern (lines 1-2) according to the utility functions described in
Table III, and a reward structure rGU (line 4) weighs them
against each other by using the utility preferences specified as
weights for performance and cost (line 3).


1  formula uP=(rps/RPS_MAX); // Performance
2  formula uC=1-(pn/PN_MAX); // Cost
3  const double wP = 0.8, wC = 0.2; // Utility preferences
4  rewards "rGU" turn=SYST : wP * uP + wC * uC; // Instantaneous utility
5  endrewards

Listing 10. Utility profile for DCAS SMG.


C. Analysis

SMG models of human-in-the-loop adaptation can be
exploited to determine: (i) the expected outcome of human
involvement in adaptation, and (ii) the conditions under which
such involvement improves over fully automated adaptation.
To answer these questions, we make use of rPATL
specifications that include reward-specific operators aimed at checking
quantitative properties over SMG models. Specifically, our
technique enables us to statically analyze a particular region
of the state space, which first has to be discretized to check
rPATL properties. Obtaining the results of the analysis for
each state in the discrete set requires an independent run of
the model checker in which model parameters are instantiated
with variable values corresponding to that state.

1) Strategy Utility: The expected utility value of an
adaptation strategy (potentially including non-automated tactics) is
quantified by checking the reachability reward property:

$$u_{mau} = \langle\langle sys \rangle\rangle R^{rGU}_{max=?}[F^c\ t = MAX\_TIME]$$

The property obtains the maximum accrued utility value
(i.e., corresponding to reward rGU - Listing 10) that the system
player can achieve until the end of execution (t=MAX_TIME).
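To illustrate what "accrued" means here, the sketch below sums the instantaneous utility of Listing 10 along one fixed path, with one reward per system turn. It is only an approximation of the idea: the model checker computes a maximum expected value over all strategies and probabilistic outcomes, whereas this code follows a single hand-written trace. The trace, the bounds, and the function name are hypothetical.

```python
from typing import Iterable, Tuple


def accrued_utility(trace: Iterable[Tuple[float, int]], rps_max: float,
                    pn_max: int, w_p: float = 0.8, w_c: float = 0.2) -> float:
    """Sum the instantaneous utility over the (rps, pn) pairs of one path."""
    total = 0.0
    for rps, pn in trace:
        # Same expression as the rGU reward: wP * uP + wC * uC.
        total += w_p * (rps / rps_max) + w_c * (1.0 - pn / pn_max)
    return total


# Hypothetical path: 100 system turns before scale out, 100 turns after it.
trace = [(40.0, 1)] * 100 + [(80.0, 2)] * 100
print(accrued_utility(trace, rps_max=100.0, pn_max=4))
```

This also explains the scale of the reported values: each turn contributes at most 1, so a time frame of 200 units bounds the accrued utility by roughly 200.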

Figure 2(a) depicts strategy utility analysis results for the
different adaptation strategies in a scale out DCAS scenario. In
the figure, a discretized region of the state space is projected
over the dimensions that correspond to training and stress
levels of a human actor (with values in the range [0,1] and
[0,10], respectively). Each point on the mesh represents the
maximum accrued utility that the system can achieve on a
DCAS SMG model instantiated for a time frame [0,200]. Tactic
cost/benefit values and the utility profile employed are those
described in Section IV-A.

If we focus on the curve described by values in the lowest
stress level, the figure shows how the use of strategy scaleOutOp
attains utility values that go above 140 for maximum training,
whereas training levels below 0.4 are highly penalized and barely
yield utility. Moreover, progressively higher stress levels reduce the
probability of successful tactic completion, flattening the curve
to a point in which training does not make any difference
and the strategy yields no utility. In contrast, the utility
obtained by the automated scaleOut strategy (represented by the
plane in the figure) is always constant at a value of 112.75,
since it is not affected by willingness or capability factors.

Fig. 2. Experimental results: (a) ScaleOut vs ScaleOutOp strategy utility (left), (b) strategy selection (center), and (c) combined utility (right).

2) Strategy Selection: Given a repertoire of adaptation
strategies $\mathcal{S}$, we can analyze their expected outcome in a given
situation by computing their expected accrued utility according
to the procedure described above. Based on this information,
the different strategies can be ranked to select the one that
maximizes the expected outcome in terms of utility. Hence,
the selected strategy $s^\dagger$ can be determined according to:

$$s^\dagger = \arg\max_{s \in \mathcal{S}} u_{mau}(s)$$

where $u_{mau}(s)$ is the value of property $u_{mau}$ evaluated in
a model instantiated with the adaptation logic of strategy s.
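The selection rule amounts to an argmax over the per-strategy values returned by the model checker. A minimal sketch follows, assuming those values have already been computed for one state of the discretized space; the function name is hypothetical, the scaleOut value of 112.75 is the constant reported in the text, and the scaleOutOp values are invented for illustration.

```python
from typing import Mapping


def select_strategy(expected_utility: Mapping[str, float]) -> str:
    """Return the strategy name with the highest expected accrued utility."""
    if not expected_utility:
        raise ValueError("empty strategy repertoire")
    return max(expected_utility, key=expected_utility.get)


# One state with a well-trained, relaxed operator (scaleOutOp value invented).
print(select_strategy({"scaleOut": 112.75, "scaleOutOp": 140.0}))
# One state with a poorly trained operator (scaleOutOp value invented).
print(select_strategy({"scaleOut": 112.75, "scaleOutOp": 20.0}))
```

Repeating this call for every point of the training/stress grid yields a selection map of the kind discussed next, since each point requires its own pair of model checker runs.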

Figure 2(b) shows the results of the analysis of strategy
selection in DCAS scale out. The states in which human
involvement via strategy scaleOutOp is chosen (~45% of
states) are represented in black, whereas states in which the
fully automated strategy scaleOut is selected (~55%) are
colored in white. The figure shows that human involvement is
only advisable in areas in which the operator has a stress level
of 8 or below. Progressively higher stress levels make human
involvement preferable only when progressively higher
training levels are also present, which is consistent with maximizing the
probability of successful adaptation tactic completion. In any
case, for training levels below 0.4, human actor participation
is not selected even with zero stress level (this is consistent
with the function op_f_c in Listing 7, which highly penalizes
poorly trained operators - op_trainingLevel < 0.4).

Figure 2(c) shows the combined accrued utility mesh that
results from the selection process (i.e., every point in the mesh
is computed as $u_{mau}(s^\dagger)$). Note that the minimum accrued
utility never goes below the achievable utility level of the
automatic approach, over which improvements are made in the
areas in which the strategy involving human actors is selected.
The average improvement¹¹ in the combined solution is +8.93,
corresponding to a percentage improvement of 7.94% over the
automatic scale out approach, and 7.81% over the manual one.

¹¹ We employ the delta in accrued utility as an indicator of the improvement
in using a combined solution with respect to a manual/automated adaptation,
defined as $\Delta U_{mau} = |u_{mau}(s^\dagger) - u_{mau}(\arg\min_{s \in \mathcal{S}} u_{mau}(s))|$,
with $\mathcal{S} = \{ScaleOut, ScaleOutOp\}$.

VI. Conclusion

In this paper, we have described an approach that employs
formal reasoning about human involvement in the adaptation
loop of self-adaptive systems to provide a systematic approach
to human-in-the-loop adaptation. We have focused on the
execution stage of MAPE-K systems, in which human actors
adopt the role of system-level effectors. We have shown how
to incorporate concepts from cyber-human systems (CHS)
that model the probabilistic aspects of human behavior into
a language specifically tailored to describe runtime adaptation
(Stitch). Moreover, we have shown how such specifications can
be encoded into stochastic multiplayer game models amenable
to analysis via model checking. We illustrated our approach
in the context of scale out adaptation in DCAS, an industrial
middleware for monitoring large sensor networks in renewable
energy power plants. Our results showed that our approach can
help discriminate cases in which the involvement of human
actors in execution leads to an improvement of system utility,
providing the basis to combine human-based and automated
adaptations in a context-sensitive manner.

Concerning  future  work,  our  current  models  assume  that 
actors  and  system  are  working  in  coalition  to  achieve  goals. 
In  fact,  the  interaction  may  be  more  subtle  than  that;  Eskins 
and  Sanders  point  out  that  humans  may  have  their  own 
motivations  that  run  counter  to  policy  [17].  To  capture  this 
subtlety,  we  plan  on  extending  the  encoding  of  SMGs  to 
model  human  actors  as  separate  players.  Moreover,  we  will 
extend  our  approach  to  formally  model  and  analyze  human 
involvement  in  other  stages  of  MAPE-K,  studying  how  to 
best  represent  human-controlled  tactic  selection,  and  human- 
assisted  knowledge  acquisition.  Finally,  we  plan  on  performing 
live  experiments  with  human  actors  for  validating  our  results. 